Arm backend: Wire examples/arm/run.sh for cortex-m55+int8 e2e testing by rascani · Pull Request #19073 · pytorch/executorch

rascani · 2026-04-23T17:27:39Z

Summary

Adds cortex-m55+int8 target support to the public Arm runner scripts so the bundled-IO FVP test flow is reachable from examples/arm/run.sh instead of being inlined in .ci/scripts/test_cortex_m_e2e.sh.

backends/arm/scripts/run_fvp.sh now accepts --bundle= and, when the target matches cortex-m, launches the Corstone-300 FVP with semihosting enabled and the cortex-m test runner as the loaded ELF, then greps the FVP log for the Test_result: PASS/FAIL line emitted by the bundled-IO runner. examples/arm/run.sh detects cortex-m targets, force-disables --delegate (the cortex-m backend is an operator library, not a delegate), invokes backends/cortex_m/test/build_test_runner.sh for the shared semihosting runner, and delegates the FVP launch to run_fvp.sh.

.ci/scripts/test_cortex_m_e2e.sh becomes a thin wrapper over run.sh so CI and user-facing docs exercise the same code path. The redundant build_test_runner.sh invocation in trunk.yml is removed since run.sh now calls it internally.

This change was authored with Claude (claude-opus-4-7[1m]).

Test plan

# Step 1 — Primary e2e via run.sh (expect Test_result: PASS and exit 0):
examples/arm/run.sh --model_name=add --target=cortex-m55+int8 --quantize --bundleio

# Step 2 — Second model (mv2):
examples/arm/run.sh --model_name=mv2 --target=cortex-m55+int8 --quantize --bundleio

# Step 3 — CI-wrapper parity (test_cortex_m_e2e.sh):
bash .ci/scripts/test_cortex_m_e2e.sh add

# Step 4 — Negative path (missing --bundleio): Expect exit 1 with error
examples/arm/run.sh --model_name=add --target=cortex-m55+int8 --quantize

# Step 5 — Regression guard for the edited run_fvp.sh ethos-u55 path (expect model to run on FVP and exit 0):
examples/arm/run.sh --model_name=add --target=ethos-u55-128 --quantize

cc @digantdesai @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell

Adds cortex-m55+int8 target support to the public Arm runner scripts so the bundled-IO FVP test flow is reachable from examples/arm/run.sh instead of being inlined in .ci/scripts/test_cortex_m_e2e.sh. backends/arm/scripts/run_fvp.sh now accepts --bundle=<bpte> and, when the target matches cortex-m, launches the Corstone-300 FVP with semihosting enabled and the cortex-m test runner as the loaded ELF, then greps the FVP log for the Test_result: PASS/FAIL line emitted by the bundled-IO runner. examples/arm/run.sh detects cortex-m targets, force-disables --delegate (the cortex-m backend is an operator library, not a delegate), invokes backends/cortex_m/test/build_test_runner.sh for the shared semihosting runner, and delegates the FVP launch to run_fvp.sh. .ci/scripts/test_cortex_m_e2e.sh becomes a thin wrapper over run.sh so CI and user-facing docs exercise the same code path. The redundant build_test_runner.sh invocation in trunk.yml is removed since run.sh now calls it internally. This change was authored with Claude (claude-opus-4-7[1m]).

pytorch-bot · 2026-04-23T17:27:43Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19073

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 3 Cancelled Jobs, 5 Unrelated Failures

As of commit 6a25889 with merge base e4ab34d ():

NEW FAILURE - The following job has failed:

Windows MSVC Build / build-windows-msvc / windows-job (gh)

CANCELLED JOBS - The following jobs were cancelled. Please retry:

pull / test-sqnr-static-llm-qnn-linux (smollm2_135m) / linux-job (gh)
##[error]The operation was canceled.
pull / test-voxtral-realtime-xnnpack-linux / linux-job (gh)
##[error]The operation was canceled.
trunk / test-llama-runner-qnn-linux (fp32, qnn_16a16w, qnn) / linux-job (gh)
##[error]The operation was canceled.

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

periodic / test-models-linux (cmake, mv3, portable, linux.2xlarge, 90) / linux-job (gh) (detected as infra flaky with no log or failing log classifier)
pull / unittest / macos / macos-job (gh) (similar failure)
##[error]The operation was canceled.
trunk / test-llama-runner-linux (fp32, xnnpack+custom, linux.2xlarge, executorch-ubuntu-22.04-clang12) / linux-job (gh) (detected as infra flaky with no log or failing log classifier)

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / unittest-editable / macos / macos-job (gh) (trunk failure)
##[error]The operation was canceled.
trunk / unittest-release / macos / macos-job (gh) (trunk failure)
##[error]The operation was canceled.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions · 2026-04-23T17:28:28Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

psiddh · 2026-04-27T22:30:44Z

@claude review

claude · 2026-04-27T22:31:33Z

Claude finished @psiddh's task in 3m 20s —— View job

Review: Arm backend cortex-m55+int8 e2e wiring

Gather context and read changed files
Review diff against origin/main
Provide review feedback

Overall this is a well-structured PR. Consolidating the e2e flow into run.sh so CI and user-facing docs exercise the same code path is a clear improvement. The cortex-m branch in run_fvp.sh is clean and the early-exit with its own PASS/FAIL checking makes sense since etdump/generic-error scraping doesn't apply to the semihosting flow.

A few items worth addressing:

1. --quantize is silently ignored by run.sh (medium)

test_cortex_m_e2e.sh:23 passes --quantize to run.sh, and the PR's test plan also uses it. However, run.sh's arg parser (lines 89–122) has no --quantize case — only --no_quantize. The unrecognized flag falls through to *) ;; and is silently swallowed. This works because quantize is the default, but it's misleading: a reader would assume --quantize is a real flag. If someone later changes the default, this would silently break.

Suggestion: either drop --quantize from the CI wrapper (since it's the default), or add --quantize as a recognized no-op/explicit flag in run.sh's parser.

Fix this →

2. Dummy-input file coupling between run.sh and run_fvp.sh (minor)

run.sh:370 creates fvp_dummy_input.bin next to the .bpte, and run_fvp.sh:117–118 asserts its existence. This is a contract across two scripts that isn't enforced by anything other than comments and ordering. If a future caller invokes run_fvp.sh --bundle=... without knowing about the dummy file requirement, they get a somewhat cryptic error.

Consider having run_fvp.sh create the dummy file itself when --bundle is specified — it has all the information needed (bundle_dir). This removes the cross-script coupling and makes run_fvp.sh self-contained for the cortex-m path.

3. $? in "Simulation complete" message is always 0 (nit, pre-existing)

At run_fvp.sh:133:

... | sed 's/\r$//' | tee ${log_file} || true
echo "[${BASH_SOURCE[0]}] Simulation complete, $?"

The || true ensures the pipeline always exits 0, so $? in the echo is always 0 — the FVP's actual exit code is lost. This is the same pattern used in the existing ethos-u55/u85 branches (lines 158, 171), so it's pre-existing, but it's worth noting since this branch does its own PASS/FAIL checking anyway and the message could be confusing during debugging.

4. check_setup() has no branch for cortex-m targets (minor)

run.sh:194–234: check_setup() handles ethos-u and vgf targets but has no explicit branch for cortex-m*. For cortex-m, it sources setup_path_script and returns 0 without toolchain checks or calling build_executorch.sh. This works because build_test_runner.sh (called at run.sh:365) internally invokes build_executorch.sh. But toolchain validation is skipped, so setup errors won't surface until deeper in the build. A cortex-m branch in check_setup (even just a toolchain hash check) would catch misconfigurations earlier.

5. FVP model selection uses cortex-m55* specifically (nit)

run_fvp.sh:58 matches cortex-m55* for FVP selection (Corstone-300), while line 108 uses the broader cortex-m* for the semihosting launch path. This is correct for now since only cortex-m55 is supported, but if cortex-m33 or other variants are added, line 58 would need updating while line 108 would already handle them. Worth a brief comment noting the intentional specificity, or matching on cortex-m* and selecting the FVP based on a more explicit mapping.

Summary: The PR cleanly consolidates the cortex-m e2e flow. Items 1 and 2 are the most actionable — the rest are nits or pre-existing patterns. Nice work reducing test_cortex_m_e2e.sh to a thin wrapper.

psiddh · 2026-04-29T20:01:11Z

Accepting it to unblock

psiddh · 2026-04-29T20:02:24Z

Looks like there is a merge conflict (--quantize is silently ignored by run.sh seems reasonable feedback from Claude)

# Conflicts: # .github/workflows/trunk.yml

Drops `--quantize` from .ci/scripts/test_cortex_m_e2e.sh — run.sh has no such flag (only --no_quantize), so it was being silently swallowed; the default behavior is to quantize, so the wrapper just needs to omit it. Moves the dummy-input file creation back into run_fvp.sh so the script is self-contained when given --bundle=. Previously run.sh created fvp_dummy_input.bin next to the .bpte and run_fvp.sh asserted its existence — a cross-script contract enforced only by ordering. With this change run_fvp.sh creates the placeholder itself; future callers of run_fvp.sh --bundle= no longer need to know about the requirement. Adds a cortex-m branch to run.sh's check_setup() so a missing arm-none-eabi-gcc toolchain surfaces immediately rather than deeper in the build_test_runner.sh chain. Adds a comment to run_fvp.sh's FVP-model selection noting that cortex-m55 specifically maps to the SSE-300 FVP because it's the only Cortex-M CPU on Corstone-300 today; cortex-m85 falls through to SSE-320. Removes the now-redundant `backends/cortex_m/test/build_test_runner.sh` step from the reusable cortex-m e2e workflow — run.sh invokes it internally as part of the cortex-m target branch. This change was authored with Claude (claude-opus-4-7[1m]).

rascani · 2026-04-30T16:27:26Z

Failures unrelated.

rascani requested a review from zingo April 23, 2026 17:27

rascani requested a review from digantdesai as a code owner April 23, 2026 17:27

meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 23, 2026

github-actions Bot added ciflow/trunk module: arm Issues related to arm backend labels Apr 23, 2026

rascani requested review from AdrianLundell and psiddh April 23, 2026 17:29

Merge branch 'main' into cortex-m-run-sh-e2e

242b537

psiddh approved these changes Apr 29, 2026

View reviewed changes

rascani added 2 commits April 29, 2026 13:40

Merge remote-tracking branch 'origin/main' into cortex-m-run-sh-e2e

087e6c7

# Conflicts: # .github/workflows/trunk.yml

rascani merged commit 5be325a into pytorch:main Apr 30, 2026
563 of 584 checks passed

rascani deleted the cortex-m-run-sh-e2e branch April 30, 2026 16:27

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Arm backend: Wire examples/arm/run.sh for cortex-m55+int8 e2e testing#19073

Arm backend: Wire examples/arm/run.sh for cortex-m55+int8 e2e testing#19073
rascani merged 4 commits into
pytorch:mainfrom
rascani:cortex-m-run-sh-e2e

rascani commented Apr 23, 2026 •

edited by pytorch-bot Bot

Loading

Uh oh!

pytorch-bot Bot commented Apr 23, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Apr 23, 2026

Uh oh!

psiddh commented Apr 27, 2026

Uh oh!

claude Bot commented Apr 27, 2026 •

edited

Loading

Uh oh!

psiddh commented Apr 29, 2026

Uh oh!

psiddh commented Apr 29, 2026

Uh oh!

rascani commented Apr 30, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

rascani commented Apr 23, 2026 • edited by pytorch-bot Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Uh oh!

pytorch-bot Bot commented Apr 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19073

❌ 1 New Failure, 3 Cancelled Jobs, 5 Unrelated Failures

Uh oh!

github-actions Bot commented Apr 23, 2026

This PR needs a release notes: label

Uh oh!

psiddh commented Apr 27, 2026

Uh oh!

claude Bot commented Apr 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review: Arm backend cortex-m55+int8 e2e wiring

Uh oh!

psiddh commented Apr 29, 2026

Uh oh!

psiddh commented Apr 29, 2026

Uh oh!

rascani commented Apr 30, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

rascani commented Apr 23, 2026 •

edited by pytorch-bot Bot

Loading

pytorch-bot Bot commented Apr 23, 2026 •

edited

Loading

This PR needs a `release notes:` label

claude Bot commented Apr 27, 2026 •

edited

Loading